Extracting strong measurement noise from stochastic series: 
applications to empirical data 
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It is a big challenge in the analysis of experimental data to disentangle the unavoidable measurement noise 
from the intrinsic dynamical noise. Here we present a general operational method to extract measurement noise 
from stochastic time series, even in the case when the amplitudes of measurement noise and uncontaminated 
signal are of the same order of magnitude. Our approach is based on a recently developed method for a nonpara- 
metric reconstruction of Langevin processes. Minimizing a proper non-negative function the procedure is able 
to correctly extract strong measurement noise and to estimate drift and diffusion coefficients in the Langevin 
equation describing the evolution of the original uncorrupted signal. As input, the algorithm uses only the 
two first conditional moments extracted directly from the stochastic series and is therefore suitable for a broad 
panoply of different signals. To demonstrate the power of the method we apply the algorithm to synthetic as 
well as climatological measurement data, namely the daily North Atlantic Oscillation index, shedding new light 
on the discussion of the nature of its underlying physical processes. 



Q 

U 
d 



oo 
(N 

(N 

o 



PACS numbers: 05.40.Ca, 02.50.Ey, 92.70.Gt 

Keywords: Measurement noise, Stochastic processes, Climate change 



I. INTRODUCTION 

Recently, much effort has been made to uncover the dy- 
namical process underlying a given time series of scale and 
time dependent complex systems fj], 0, Ht]. In many cases it 
is possible to describe such systems by a Langevin equation, 
extracted directly from the data, which separates the deter- 
ministic and stochastic processes inherent to the system^. 
Such an approach has already been carried out successfully for 
instance for data from turbulent fluid dynamics JH], financial 
dataifl, climate indices|0,[^] and for electroencephalographic 
recordings from epilepsy patients H |T(J and additional im- 
provements were proposed to address the case of low sam- 
pling ratesEl . 

However, typically the signal is subject to noise, due to 
experimental constraints or due to the measurement or dis- 
cretization procedure leading to the data set to be studied. 
Such noise is not intrinsic to the system, differing from what 
is known as dynamical noise, and therefore one is interested 
to separate it from the stochastic process. We call such non- 
intrinsic noise measurement noise. To separate the measure- 
ment noise from the dynamics of the measured variable dif- 
ferent predictor models or schemes for noise reduction may 
be usedfjl Ht]. In this context, an alternative procedure has 
been proposedHH to extract the intrinsic dynamics associ- 
ated with Langevin processes strongly contaminated by mea- 
surement noise, based solely on the two conditional moments 
directly calculated from the data ! 12l 1 1311 . 

In this manuscript we will revisit this nonparametric pro- 
cedure, describing it in detail and explaining the main steps 
for its implementation, with the aim of applying it to empir- 
ical data sets. Let us consider a one-dimensional Langevin 
process x(t) (an extension to more dimensions is straightfor- 



ward) defined as 



dx 
~dt 



+ y/D 2 (x)T t , 



(1) 



where Tt represents a Gaussian ^-correlated white noise 
(r t ) = and (r t r t /) = S(t - t'). Functions D x (x) and 
£>2 (x) are the drift and diffusion coefficients defined as 



D n (x) 



— lim -M n (x,r) 



(2) 



for n = 1,2, where M n (x,r) denotes the n-th order con- 
ditional moment of the data, as explained below. Further, 
we consider that x(t) is 'contaminated' by a Gaussian 5- 
correlated measurement white noise, which leads to the series 
of observations 



y(t) = x(t) + a((t) 



(3) 



where a denotes the amplitude of the measurement noise. 

When there is no measurement noise (a = 0), Eq. ® yields 
the particular case y(t) = x(t), and the evolution equation 
underlying the signal can be extracted directly from the two 
conditional moments (n = 1,2) 



M n ( yi ,r) = ((y(t + T)-y(t)) n )\ v{t)=yi 



(4) 

as described in Refs. 0,11 110. 

In the presence of measurement noise (<r ^ 0) the condi- 
tional moments depend on x, r and a. Since generally the 
limit 



lim M n (x,a ^ 0, r) 



(5) 



does not exist, Eq. (3) cannot be applied. The aim of this 
paper, however, is to explicitly derive a procedure which can 



2 



transform the functional form of the 'noisy conditional mo- 
ments' Mi(x, a, t) and M 2 (x, a, r) at small r into the 'true' 
coefficients D\ (x) and D 2 (x) and simultaneously retrieve the 
amplitude a of the associated measurement noise. For that, 
we show that M n (y, r) for fixed y is typically linear in r for a 
certain range [n, T2] of values (see Fig.[4]below). Therefore, 
even when a ^ one can estimate the quantities 



D n (y) 



M n {y,T 2 ) - M n (y,ri) 
n!(T 2 - n) 



(6) 



We start in Sec. [II] by briefly describing the procedure to 
extract Langevin equations from data sets and show how the 
drift and diffusion coefficients depend on the measurement 
noise strength a. In particular, we will see that the proposed 
estimate 111411 does not yield the correct value when the mea- 
surement noise is too strong. In Sec. [Ill] we then proceed to 
minimize a proper least square function using the Levenberg- 
Marquardt procedure II 151 . By applying this algorithm to syn- 
thetic data we show that indeed this approach is able to reli- 
ably extract the noise amplitude even in cases where it is of the 
same order as the synthetic signal without noise. Furthermore, 
the procedure yields simultaneously more accurate estimates 
for the clean signal x(t). Finally, in Sec. [IV] we apply this 
framework to an empirical data set, namely the North Atlantic 
Oscillation daily indexfljl, giving some insight from the ob- 
tained results to the underlying system. Discussion and con- 
clusions are given in Sec. [V] where further possible applica- 
tions are proposed. All details concerning the implementation 
of the minimization procedure to extract strong measurement 
noise are given as appendices. 



II. STOCHASTIC TIME SERIES WITH STRONG 
MEASUREMENT NOISE 

We consider a time series generated by integrating Eq. (fl~|i 
with drift and diffusion coefficient assumed to be linear and 
quadratic forms respectively 



Di(x) = d w 
D 2 (x) = d 20 



dux 



d 2 \x + d 22 x 2 



(7a) 



(7b) 



and by adding separately to each data point the measurement 
term <r£(t) in Eq. ([3). Though we concentrate on the par- 
ticular expressions for D\ and D 2 given above, it should be 
stressed that they comprehend a large collection of different 
processes, such as Ornstein-Uhlenbeck processes lO- Fur- 
ther, some generalizations may be carried out as will be dis- 
cussed in Sec. [V] Using Eqs. ([7^) and 05), one has six pa- 
rameters: five coefficients dij defining the evolution equation 
of the clean signal and a sixth parameter a for the amplitude 
of the measurement noise. 

Figure Q] illustrates this influence of noise for a particular 
choice of Di(x), D 2 (x). As shown in Fig.QJ, for increasing 
<7 one obtains broader probability density functions P(y) as 
one intuitively expects. Quantitatively, the standard deviation 
6 of P(y) varies quadratically with the measurement noise a, 




FIG. 1: Langevin time series with different measurement noise 
strengths. Here we show (a) the probability density function P{y) 
of the series with noise (see Eq. l[3]l), with the corresponding mean 
value fi and standard deviation 6 in the inset, and the corresponding 
functions (b) Di(y) and (c) 1)2(2/), see El- ©• m a U cases, the 
assumed time series x(t) without measurement noise uses the coef- 
ficients Di(x) = 1 — x and D%(x) = 1 — x + x 2 . 
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FIG. 2: Noise dependence of functions Di(y) and £>2 (y) (see text 
and Eq. (|6]l) The underlying Langevin time series x(t) without noise 
is the same as in Fig. [T] 



while the mean value /i of P(y) remains constant, as shown in 
the inset of Fig. QJ. The estimated functions £>i{y) and D 2 {y) 
change significantly, as shown in Fig.[T]} and[T]; respectively. 
Assuming Di{y) = d w + dny and D 2 (y) = d 20 + d 21 y + 
d 22 y 2 , Fig.[2]shows how the estimated parameters dij deviate 
from the 'true' uncontaminated values dij in Eq. (0 when 
measurement noise increases. Notice that for a = - see left 
vertical axis in the plots of Fig. [2]- the estimated parameter 
values are approximately correct. 

To correctly derive the drift and diffusion coefficients 
D\ (x) and D 2 (x) when a is strong, we consider the measured 
conditional moments Mi(yi, r) and M 2 (yi, r), as in Eq. (O, 




FIG. 3: Conditional moments Mi (j/i , t) and M2 (j/i , r ) as a function 
of bin for r = and different measurement noise strengths. The 
asymmetry of M2 is due to c?2i 7^ (see Eqs. 10). The same a;(i) as 
in Fig. Q] was used. 
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FIG. 4: Conditional moments (a) M\(yi,r) and (b) M2(jji,r) as 
a function of r, for bin yi = and different measurement noise 
strengths. In (c) one compares the true measurement noise with the 
approximation a app = M-z{Q, 0) ~ 2<r 2 given in Eq. J5J. In the 
inset the corresponding absolute and relative erros are given by C, a = 
\<J — a a pp \ and ( r = Ca/o" respectively. Errors for M2 are negligible. 
The same x(t) as in Fig. Q] was used. 



the hat indicating that they are calculated from the measured 
data y(t) directly. Since this conditional moments depend in 
a non trivial way on both time t and amplitude yi, we approx- 
imate them up to first order on r: 



Mi(vi 



(y(t + r)-y(t))\ y(t) ._ 



FIG. 5: Functions mi, rhi, 71 and 72 (symbols) defining the con- 
ditional moments in Eqs. The underlying Langevin time series 
x(t) without noise is characterized by a drift coefficient D\(x) = 
1 — x and a diffusion coefficient D2 (x) = 1 — x + x 2 . The measure- 
ment noise was fixed at a = 1. Each hat-function is compared with 
the corresponding integral form in Eqs. d 1 01 > using the first estimate 
of parameters values (dashed lines) and the true values (solid lines). 



Af 2 (»,T) 



rmxfoO+TlfoO + OtA (8a) 

{(y(t + T)-y(t)f)\ y{t)=yi 
Tm 2 { yi )+^ 2 { yi ) + G 2 + 0{t 2 ), (8b) 



where y(t) is taken in the range ± Ay/2 for each bin i, and 
Ay depends on the binning considered. AppendixlAl gives the 
full derivation of Eqs. ([8]). 

Figure [3] shows both conditional moments for t = and 
with different measurement noise strengths. Conversely, in 
Fig.Hk and@]3 one sees that the conditional moments depend 
linearly on r for a fixed amplitude y, which justifies the ap- 
proximation assumed in Eqs. ((H). Therefore, to study the de- 
pendence of the conditional moments on y we will consider 
the linear decompositions in Eqs. ([8]), as done in Fig. [5] Our 
simulations with synthetic data have shown that using a to 
large range of r values yields results for D\ and D 2 deviated 
from their true values. The best estimation for both Kramers- 
Moyal coefficients are obtained using the range 1 < r < 4. 

Notice that for sufficiently small measurement noise a good 
estimate of it is given bv lfl3l[l4ll 



(9) 



where \i is the average value of y(t) data points in the time 
series. For details see Append. [A] However, as shown in 
Fig. this approximation is no longer valid for sufficiently 
high measurement noise, namely when a > 0.5 (see inset of 
Fig. 2J:) and even otherwise coefficients D\ and D 2 are not 
correctly estimated (see Fig.|2|i. Therefore, a better algorithm 
to estimate such parameters is necessary. 

The heart of our procedure to correctly estimate measure- 
ment noise lies in the fact that while the functions m, and 7^ 
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FIG. 6: Function F in Eq. il lb as a function of (a) dio, (b) dn, (c) 
d,2o, (d) c?2i, (e) (I22 and (f) cr. The same situation as in Fig.[2]is here 
chosen: Di(x) = 1 — x, D2(x) — 1 — x + x 2 and a = 1. Dashed 
lines indicate the true values used for generating the data series, while 
the bullet indicates the estimated values of the Kramers-Moyal coef- 
ficients for cr = 0. In each plot while varying one parameter, the 
remaining ones are fixed at their true values (see text). 



(i=l,2) are obtained explicitly for each bin value j/j, functions 
rrii and 7, depend generally on the drift and diffusion coeffi- 
cients as follows: 



71 (y) 
72(2/) 

m 1 (y) 
m 2 (y) 



— OO 
+ OO 



OO 



(x - y)fa(x\y)dx 
(x - y) 2 I<r{x\y)dx 



(10a) 
(10b) 
(10c) 



r*+oo 

D 1 (x)f a {x\y)dx 

' — OC 

2 / [(x - y)D 1 (x) + D 2 (x)}f a (x\y)dx, 



(lOd) 



where fa{x\y) is the probability for the system to adopt the 
value x when a measured value y is observed. For details 
about the derivation of functions in Eqs. (TlOb see Append. lAl 
and for the explicit expression of f a (x\y) see Append. FBI 

In Fig. [5] we illustrate both the hat-functions in Eqs. dS) and 
their integral form in Eqs. ( [Tol l. Due to the measurement noise 
fixed in this example at a = 1 the hat-functions (symbols) are 
not properly fitted by the integral form in Eqs. (TlOb using the 
first estimate (dashed lines) of the parameters dij, taken from 
Fig. 12 and a, computed from Eq. (|9). If instead we use the 
true parameter values in the integral forms of our m, and 7; 
functions a proper fit is obtained (solid lines). 



Therefore, the problem we want to solve is to determine the 
parameters that minimize the function: 

"(71 " 7i (2/i)) 2 , 



f = ±y\^ 



\{yi) 

(72 - 72 (Vi) - o- 2 )' 
(mi - mi(^)) 2 
(m 2 - m 2 (jji)y 



(11) 



where the summation extends over all M bins, cr^j (yi) is the 
error associated to function 71 at the value and similarly 
for <x^ 2 , cr^j and <ta 2 - Notice that the values of such cr,^ and 
<jrhi are taken directly from the data only. See AppendixlAlfor 
details. 

Taking again the example illustrated in Fig. ® with a = 1 
we plot in Fig.|6]function F in Eq. (fTTT > as function of each one 
of the parameters keeping all others fixed at their true values. 
Evidently, the estimated values are near the minimum of F 
in each case. Further, the one-dimensional cuts of function F 
show only one minimum. One should note however that, for 
the entire 6-dimensional parameter space, several local min- 
ima of F may appear. In fact, after minimizing F by varying 
one parameter, function F also changes as a function of the 
other parameters, i.e. its minimum as a function of the other 
parameter changes. In the next Section we will see how to 
minimize function F, in order to find good estimates for the 
correct values for each parameter. 



III. OPTIMIZATION PROCEDURE 

After computing the functions 71, 72, rhi and m 2 as well 
as the corresponding errors , etc, directly from the mea- 
sured time series y(t) and estimating the coefficients D\ and 
D 2 given by the functional forms in Eqs. (0 there are several 
ways to minimize F. All of them start from the initially esti- 
mated set of values for the parameters and iteratively improve 
the solution, by finding lower values of F, till convergence is 
attained. 

To proceed the following remark should be considered. Pa- 
rameter din can be always eliminated with a simple transfor- 
mation x — > x' = x + dio/dn- Alternatively, and since 
we do not know beforehand the true values of d±o and du 
we can consider also the fact that averaging Eq. yields 
dio = —dix(x) and consider the transformation 2/ = x — (x). 
With these arguments, we henceforth disregard dio, which re- 
duces the dimension of parameter space by one. Parameter 
d\o is computed from the relations above, only after minimiz- 
ing F. For simplicity the primes in x' will be omitted. 

The simplest way is to minimize each term in F and re- 
peat that a large number of times starting from different initial 
conditions for the parameters, in a sort of a Monte Carlo pro- 
cedure of random walks lfl7ll or Levv-walks lfl8ll . The Monte 
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Carlo procedure assures that a substantial number of local 
minimal for F will be visited, and in the end we take the mini- 
mum of all F values found. Simulations have shown however 
that a Monte Carlo procedure is too expensive in this case, 
since there are different local minima and the choice of the 
minimum is strongly path dependent. We will therefore con- 



J 



sider the Levenberg-Marquardt method[ 15]. 

For the Levenberg-Marquardt procedure one computes the 
first and second derivative of F. Symbolizing the parameters 
cr, dn , d 2 o, d 2 i and CZ22 by pk with k = 1, . . . , 5 respectively, 
these derivatives read 



dF 
dpk 



M^la 2 

i=l 71 



7l ^71 



(i) dp k 



72 - 72 



9(72 + cr 2 ) rhi — mi dmi rh 2 — 1112 dm 2 



a 2 

72 



(i) 



dpk 



a 2 

m 2 



(i) dp k 



(12) 



d 2 F 
dpkdpi 



M 



97i c?7i 71 - 71 <9 2 7i 



M^lal(i)dp k dp e 



1 9(72 + cr 2 ) 5(72 + cr 2 ) 72 - 72 ~ cr 2 2 (72 + cr 2 ) 



dpi 



<4« 



dp k dpi 



+ 



1 <9mi 9mi 



_2_ \ - r 1 971 97i , 

M ^ Vol 



rhi — rrii d 2 mi 



1 drri2 dm 2 



m 2 



rri2 d 2 rri2 ' 



1 5( 7 2 + a 2 ) 3(72 + cr 2 ) 



(i) (9p fc cr? (i) <9p fc 



<9w 



1 dmi dmi 



1 <9to2 9m2 
1 (i) <9p fc Opt 



72 - 72 



(13) 



r 



In the right-hand side of Eq. (fT3l we neglect the terms con- 
taining second derivatives of 7 and m functions. This last 
approximation of neglecting second derivatives is acceptable 
as far as the model is successful fl5ll . 

By symbolizing first and second derivatives as (3k and aw 
respectively the iterative procedure computes the increments 
dpk for each parameter pk (k — 1, . . . , 5), which are the solu- 
tions of 



Pk 



(14) 



Furthermore, one assumes that dpi oc (3i, which considering 
dimensional analvsis lfl5ll can be written as: 



dpi = 



(15) 



where typically A > 1. For a given A value, instead of the 
second derivatives a m n one assumes a' mn = a mn (l + A) for 
m = n and a' mn = a mn otherwise and solves Eq. ( fT~4T > for 
dp k S. 

If F(pk + dpk) < F(pk), the parameter values are updated, 
Pk —> Pk + dpk, an d A is typically decreased by 10%. Other- 
wise, if F(pk + dpk) > F(pk) one increases A by 10% and 



determines new increments dpu- The procedure stops after 
attaining the required convergence. 

Using the same data as generated in Fig. [5] with a = 1, we 
now plot in Fig.|7]the functions rhi and 7, for the data (sym- 
bols) and compare them with the integral forms of those func- 
tions for the first estimate of parameter values (dashed lines) 
and the optimized solution obtained with the Levenberg- 
Marquardt procedure (solid lines). Clearly, the optimized 
functions fit better the data and the minimum of F found is 
very close to its true value (see caption of Fig.|7|i. 

Notice that the optimized values d' ik are obtained for the 
transformed data (x — ► x' = x — (x)), assuming d' w = 
0. In practice one obtains d' w ~ 10 -2 , typically two or- 
ders of magnitude smaller than the other coefficients. Using 
(x) = —dio/du, one obtains the true coefficients according 
to dm = -d'n(x), dn = d' n , d 20 = d' 20 - d' 21 (x) +d' 22 {x) 2 , 
d 2 i = d' 21 — 2d' 22 (x) and d 22 = d' 22 . 

To show the power of the present procedure we next gen- 
erate several synthetic data sets from Eq. (Q~|i with different 
measurement noise amplitudes cr/ in the range [0, 1.2]. The 
same Di(x) and D 2 {x) as in Fig.|2]is used. Results are shown 
in Fig. [8] The circles indicate the obtained parameter values 
for the first estimate, as in Fig. [2] The solid lines indicate the 
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FIG. 7: Functions mi, mi, 71 and 72 for the Langevin process with 
Di(x) = 1 — x, D2(x) = 1 — x + x 2 and a = 1. Symbols 
indicate the functions obtained for the data, dashed line corresponds 
to the first estimate of the parameters and solid line corresponds to the 
parameter values obtained from the Levenberg-Marquardt procedure 
(see text). In this case, for the first estimate one has Fq = 3720 
while the final estimate retrieves Flm — 33.1. The true minimum is 
Fm — 29.2. 



true values used to generate the data, while bullets indicate the 
value after optimization. 

From Fig.[8k one sees that after optimization the value of 07 
is always correctly determined. Such finding is of major im- 
portance and shows the relevance of our approach for practical 
applications even for strong measurement noise, since the un- 
contaminated series x typically lies within the range [—2,2], 
having therefore values close to the amplitude 07 of the mea- 
surement noise. 

Figures [8p and [8J: also show a very reliable estimate for 
the two parameters dio an d dn respectively, defining the drift 
coefficient Di(x). Since this coefficient characterizes the de- 
terministic part of the evolution equation for x, this accurate 
estimate should provide valuable insight into the dynamics of 
the underlying system. 

As for the diffusion coefficient D^{x), Figs.[8}i-f show that 
the estimate of (I22 is no longer as good as for the other param- 
eters. Parameter c^o is reasonably estimated but the optimized 
estimate is as good as the first one. 

For stronger measurement noise, namely for a > 1.2, one 
faces the problem that the optimization procedure is some- 
times stucked in a local minimum of the function F leading 
to unreliable coefficients dik- This is in principle a shortcom- 
ing of the presently used minimization algorithm. In addition, 
the function F itself is based on estimated functions m and 
7 and therefore itself subject to errors. A forthcoming study 
will address the observed issues in the context of global opti- 
mization. 
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FIG. 8: Comparison of the optimized parameters values (bullet) with 
the first estimate and the true values for different input measurement 
noise strengths 07: (a) 2a 2 , (b) dio, (c) dn, (d) (feo, (e) (fei, (f) 
^22- The measurement noise is correctly extracted as well as the 
parameters defining the drift coefficient D\(x) which controls the 
deterministic part of the underlying evolution equation (see text). 



IV. THE NORTH ATLANTIC OSCILLATION: AN 
EMPIRICAL EXAMPLE 

In this section we apply our framework to the North At- 
lantic Oscillation daily index, which presents data with strong 
measurement noise. Table J] summarizes the optimized values 
for all parameter describing the data set, comparing it with 
simulations. 

The North Atlantic Oscillation (NAO) is a source of vari- 
ability in the global atmosphere, describing a large-scale vac- 
illation in atmospheric mass between the anticyclone near the 
Azores and the cyclone near Iceland lfl9ll . The state of the 
NAO is usually measured by an index N, defined as the nor- 
malized pressure difference between the high and the low 
poles, where the pressures are averaged over each, day, month 
or year Jfl [l^]. The NAO index and climate indices in gen- 
eral are receiving much attention due to their important role in 
climate change. Lately, evidences for the stochasticity of this 
index have been shown|0[Ml- In this section we address the 
problem of estimating its measurement noise amplitude. 

Figures [9^ and|9p show the drift and diffusion coefficients 
respectively for the NAO daily index (bullets) and the corre- 
sponding fit (solid line). The parameters dij for both D\ and 
D2 are given in Tab.|T]together with the amplitude of the mea- 
surement noise. Probably due to the small amount of data 
points (16 801 values) one observes large scattering of the 
data, particularly away from the average value (N) ~ 0. 
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FIG. 9: (a)-(b) Estimate of the drift and diffusion coefficients D\ (N) 
and D 2 (N) of the daily North Atlantic Index 7VG3] (16801 dat- 
apoints), together with the corresponding (c) mi(N), (d) m,2(N), 
(e) 7i(A r ) and (f) 72 (AT). Results for the empirical NAO index are 
represented with bullets whereas the synthetic data also with 16801 
datapoints and parameter values given by Tab. U is shown with cir- 
cles for comparison. The corresponding fits are given with solid and 
dashed lines, respectively. 



To evaluate the reliability of considering the NAO index a 
Markov process described by Eq. (Q]i we also plot in Figs. [9] 
the results obtained when integrating such equation (circles) 
using the coefficient values in Tab. |T] including the amplitude 
of the measurement noise. The same sample size was consid- 
ered. The corresponding fit is represented with a dashed line. 
While the drift coefficient D\ resembles the one observed for 
the NAO index, there is a significant shift of the diffusion co- 
efficient, that only for a very narrow range around the average 
value is well reproduced. Indeed, as one sees from Tab. [I] the 
coefficient values for D2 in our simulation significantly devi- 
ate from the ones found for the NAO series. 

Further, functions mi and 7;, plotted in Figs. [9j>f, show 
also large scattering, particularly for 72. This feature raises 
difficulties in a proper minimum search for F. 

In order to check the reliability of the calculations we repro- 
duce the synthetic data 10 times and present in Tab.|I](column 
"With noise") the average values for each parameter, where 
the error is taken as the largest deviation from the average 
over the sample of data sets. The measurement noise, which 
dominates all parameters, is well reproduced. For the drift 
and diffusion coefficient the order of magnitude of each pa- 



Param. 


NAO Index 
(16801 pts) 


Simulations (16801 pts, 10 sim) 


With noise 


No noise 


Only noise 


O (Xl0~ 3 ) 


455 


455 ± 34 


106 ± 17 


321 ± 9 


dio (xio -3 ) 


-2.6 


-3.1 ±0.5 


-3.8 ±0.1 


10~ 6 ± 10~ 2 


dn (xio- 3 ) 


-40 


-24 ±9 


-29 ±2 


0.1 ± 1 


d20 (XIO" 3 ) 


39 


24 ±1 


29 ±0.1 


0.1 ± 1 


cbi (xio- 3 ) 


1.5 


-0.3 ± 1 


-1.6 ±0.2 


0.1 ± 1 


d22 (XIO" 3 ) 


16 


13 ±3 


11 ±0.5 


-1±7 



TABLE I: Optimized parameter values for the daily North Atlantic 
Oscillation daily index fl6ll compared with the average values for 10 
sets of synthetic data ("With noise") using the same number of points 
and parameter values. In order to evaluate the reliability of our syn- 
thetic data we also run the optimization procedure for 10 sets of syn- 
thetic data with the same D\ and D2 found in NAO series and a — 
("No noise"). In the last column we plot the results returned from the 
optimization procedure for synthetic data of pure measurement noise 
with amplitude a — 0.455, the one obtained for NAO series. 



rameter is also correct, but for d\i and and c^o one observes 
significant deviations from the estimated values obtained for 
the NAO series. 

This mismatch between the empirical and synthetic series 
could raise the question if the NAO Index is indeed suitably 
described by a Markovian stochastic process with a perceiv- 
able deterministic part. In fact, since one observes a ^> d%j 
the series is approximately a pure white noise (i.e. y(t) = 
<r((t) in Eq. (O), which in fact also yields a linear drift and 
quadratic diffusion coefficients. 

To address this problem we rerun our optimization pro- 
cedure for synthetic data, for two additional situations, one 
where a = and drift and diffusion coefficients are given by 
the NAO index, and another one which simulates a pure white 
noise ( D\ = D2 = 0) with a equal to the value found for the 
NAO series. The results are also given in Tab.|H columns "No 
Noise" and "Only noise" respectively. 

For the pure white noise process one obtains a as the only 
non-zero parameter, apart fluctuations, but with an amplitude 
different from the one used to generate the synthetic data, 
namely a ~ 0.321, which corresponds to ~ 75% of the in- 
serted measurement noise (a = 0.445). For the synthetic pro- 
cess with no noise, the order of magnitude for the parameters 
of Di and D2 is correctly computed, whereas a non-zero mea- 
surement noise is retrieved covering the remaining 25% of the 
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inserted measurement noise. In other words, one can argue 
that in this situation our procedure retrieves ~ 75% of the to- 
tal amount of measurement noise. 

In this scope, our results point in the direction of previ- 
ous arguments given by some authors l20ll : differently from 
other climate indices such as the ENSO index, the NAO index 
seems to be an almost pure white noise process with only a 
minor contribution from a stochastic process governed by a 
Langevin-like equation. Alternative indices should be there- 
fore considered and studied as recently suggested|8|. 

V. DISCUSSION AND CONCLUSIONS 

We described in detail a nonparametric procedure to extract 
measurement noise in empirical stochastic series with strong 
measurement noise. The algorithm is able to accurately ex- 
tract the strength of measurement noise and the values of the 
parameters defining the drift coefficient and to estimate with 
good accuracy the diffusion coefficient that fully describe the 
evolution equation for the measured quantity in the time se- 
ries. This has been shown by synthetically generated data sets 
contaminated by increasing measurement noise. Additionally, 
the algorithm was applied to a set of measured data providing 
new insight in the underlying systems. The data for the cli- 
mate index shows a large scattering, probably due to the small 
amount of data points. Larger data sets for climate indices are 
not available up to our knowledge. 

It should be noticed that the nonparametric reconstruction 
of the Langevin Eq. (HJ from measured stationary data sets 
generally requires that the process exhibits Markovian proper- 
ties and fulfils the Pawula theorem JH. While the second con- 
straint can be relaxed extending the analysis to a broader class 
of Langevin-like systems in which the Gaussian ^-correlated 
white noise Langevin force is replaced by a more general Levy 
noise 1I2I I2II1 . in general the Markov condition remains a cru- 
cial constraint. 

Recently, it has been shown that processes corrupted from 
measurement noise may loose their Markov properties ll22ll . 
For this reason the proper analysis of data suffering from 
strong measurement noise in general is a complicated task. 
We, however, would like to point out, that the method pre- 
sented here solely relies on Markov properties of the under- 
lying, undisturbed process x(t). In case of ^-correlated mea- 
surement noise the method presents a general approach to ac- 
cess the process x and the noise amplitude a at the same time. 

Therefore, since the algorithm is general for a broad class 
of stochastic systems other applications can be proposed. Par- 
ticularly in cases where the measurement procedure is subject 
to large measurement noise due to the distance between the lo- 
cation where the measure is taken and the location where the 
phenomena occurs. Two important applications in this con- 



text are seismographic data|01, where the epicenter can not 
be predicted before-hand, and data from surface EEGll9l [l0ll . 
which, though having stronger measurement noise, are much 
recommended instead of insitu measurements for the sake and 
comfort of the patient. A further application would be the 
analysis of sensors to which one has no access, for example 
sensors being installed in remote systems showing more and 
more measurement noise due to aging effects. Here it should 
even be possible to know quite precisely the functional struc- 
ture of the underlying process, an assumption of our analysis 
here. 

Such applications however appeal for the extension of the 
present procedures to higher dimensions, i.e. more than one 
time-series, which implies the consideration of different mea- 
surement noise sources and consequently noise mixing. To 
ascertain in which conditions and up to which point can we 
separate different measurement noise sources is an open ques- 
tion which we will address elsewhere. 

In all simulations a linear function was assumed for the drift 
coefficient and a quadratic one for diffusion. Although such 
assumptions comprehend already a broad class of systems |0, 
[H [3] our approach and all expressions may easily be ex- 
tended to higher order polynomials for D\{x) and D^{x), as 
long as the number of parameters for modelling D\{x) and 
D2 (x) is not too high. In this case the calculations presented 
in the appendices are valid if one considers proper higher pow- 
ers in the integrand of integrals hi and hi (see Eqs. dC12b in 
Append. |C]l- 

Furthermore, other possibilities for optimization are pos- 
sible. For instance, though in this case we have shown that 
random Monte Carlo procedures are computationally expen- 
sive consuming, one could think of a non-local search pro- 
cedure using for example bigger jumps such as the ones of 
a Levy flight process llZl . Alternatively one may also study 
how good would be an optimization procedure that considers 
the minimization of a splitted cost function F. Preliminary 
results have shown that for a proper decomposition of F our 
optimization problem may be reduced to a cubic equation and 
a lower dimensional system of linear equations. Another pos- 
sibility would be to use genetic algorithms !^! . These points 
will be addressed elsewhere. 
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APPENDIX A: THE CONDITIONAL MOMENTS OF AN ARBITRARY TIME SERIES AND THEIR LINEAR 

APPROXIMATIONS 

Taking a series of measurements y{t) as defined in Eq. yj, its n-th order conditional moment reads 

M n (y ,T) = ((y(t + r)-y(t)) n )\ y{t)=yo 



dx / dx dy(y - y ) n f a (y\x)f T (x\x )fa{xo\yo), (Al) 

J — oo J — oo 

where f a (y\x) is the probability to measure y in the presence of a measurement noise with variance er 2 , when the system (without 
noise) has the value x, / T (x|xo) is the probability for the system to evolve from a value xo to a value x within a time interval r 
and f<j(xo\yo) has the inverse meaning of f a : it is the probability for the system to adopt the value xo when a measured value 
yo is observed. While f T is unknown, f a and f a are related with each other according to Bayes' theorem (see App.lBl. 
From such assumptions one easily arrives to the identities 

+00 

dyf*(y\x) = 1, (A2a) 

J — OO 

dy(y-x)f a (y\x) = 0, (A2b) 

-OO 
+ OO 

dy(y-x) 2 U(y\x) = a 2 , (A2c) 

-00 

and using these identities the general expression (IAU can be approximated up to first order assuming r <C 1. More precisely, 
the first two moments M\ and M2 yield 

Mi(y 0) r) = (y(t + t) - y{t))\ y{t)=yo 

/ + OO Z' + OO 

dx \ dy(y - y a )f a (y\x)f T (x\x )fa(xo\yo), 
-OO J — OO 

/+00 /*+oc 
dx/ T (a;|xo)/ (T (a;o|yo) / dy(y - x + x - y Q )f a (y\x) 
-OO J — 00 

+ OO 



dx / drfr | ) /<r (^o 1 2/0 )>< 

«/ — OO 

+00 /*+oo 

dy(x-y )fcr{y\x) + dy(y ~ x)U{y\x) 



/+00 / r+00 

dxf T (x\xo)fv(x \yo) x Ux-yo) / dyf a (y\x) + 
-00 \ J —00 

r+00 

dx f a (x \yo) / da;(x - x + x - yo)/r(a;|a;o) 



— OO 

+ OO 



dx f a (x Q \y )x 

dx(x ~ yo)fr(x\x ) + I dx(x ~ x )f T (x\x ) 



dx f a (xo\y Q ) ((x - y ) + tD^xq) + 0(t 2 )) 

) 

TO /" + 0O 

dx (x ~ yo)fcr(xo\y ) + t / dxof^xo)^^!^) + C(r 2 ) 

) J —oc 

7 1 (yo)+rTO 1 (y )+0(r 2 ), (A3) 



M 2 (y ,r) = ((y(i + r)-y(i)) 2 >|,( t)=!/0 

« + OO {• + OC' 



dx / dx / dy(y-y ) fa(y\x)f T (x\x )f a (x \y ) 
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+ OO /> + OG 

dx / dxf T (x\xo)fv(x \yo) I dy{y - y ) 2 f a (y\x) 



dx Q dxf T (x\x )fa(x \y ) dy(y - x + x - y Q ) 2 f a (y\x) 

) J — oo J — oo 

Xj Z' + OG 

dx / dxf T (x\x )fcr(xo\yo)x 
i J — oo 

r+oc r+oc 

( / dy(y - x) 2 f a {y\x) + 2(x - y ) / dy(y - x)f a (y\x)+ 

J—oo J — oo 

r+oo 

{x - yo) 2 / dyf a (y\x)\ 

J — OO 

>o r+oo 

dx / dx/ T (x|xo)/<r(xo|yo) >< (er 2 + + (x - y ) 2 ) 

! «/ — OO 

yo r+oo 

dx /cr(xo|yo) / <ix(cr 2 + (x - yo) 2 )/r(x|x ) 

! J—OO 

yo r+oo 

dxo/cr(xo|yo) / dx(o- 2 + (x - x + x - yo) 2 )/r(x|x ) 

! •/ — OO 

yo 

dx f a (x \y )x 

i 

r + oo r+oo 

x( dx(x - x ) 2 /r(x|x ) + 2(x - y ) / dx(x - x )/ T (x|x ) 

J—oo J — oo 

/•+oo 

(a 2 + (x - y ) 2 ) / dx/ T (x|x )J 



+ 00 



dxo^2T£) 2 (xo) + 2(x - yo)T£>i(x ) + 
^ + (^o-yo) 2 )/ CT ^o|yo) + 0(T 2 ) 



= 2r / dx f £>2(x ) + (x - yo)£ , i(xo)j/ -(x |yo) + 

— oo 

/+oo 
<Mx - yo) 2 /^o|yo) + 0{t 2 ) 
-OO 

= rm 2 (yo) + <r 2 + j 2 (ya) +0{t 2 ). (A4) 

From Eg. (IA4t one has M(vn. 0) = a 2 + 72(yo) where 72(yo) = J^°° dxo(xo — yo) 2 f a (xo\yo). Such observations justify the 
first estimate for the measurement noise stated in Eq. ©, since when a is small enough, probability density function / CT (xo|yo) 
is similar to /^(yol^o) ( see Eq. ( |A2at ) and therefore, one can take as a first approximation 72(yo) ~ c 2 . 

Notice that the last equalities in Mi and M2 yield first order approximations under the assumption that r <C 1. In Ref. Hill 
another approach is proposed for the estimation of drift and diffusion coefficients in the case of low sampling rates. 

The errors for 71 (yo), 72(yo), wi(yo) and rn.2(yo) are just given from the linear fit of Mi and A/2 for each fixed yo, given in 
Eqs. (I5at and (IFbb . The errors of Mi(y, r) and A'hiy, r) can be also directly computed from the data as 

°Af>.T) = ([(y(i + ^)-2/W)-(2/(^ + ^-2/W)] 2 )tG{t 1 ,...,*„} 

= ((*(* + r) - y{t)f + M 2 (y , r) - 2Mi(j/ , r)(y(t + r) - y{t))) 

= ^ (M 2 (yo, r) + M 2 (y , t) - 2Af 2 ( 2/o , r)^ 

_ Ma^^-Mf^r) 



(A5a) 



<4>t) = ( [(y(t + r) y{t)f ((y(t + r) y{t)) 2 )} 2 ) t£{tl ,..., tn} 

= ((y(t +r) - y(t)) 4 + Af 2 (y , r) - 2M 2 (y , r)(y(t + r) - y(t)) 2 ) 
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^ (m 4 (2/ , t) + M|(y , r) - 2M 2 2 (j/ ,r) 
M 4 (?/,T)-A/ 2 2 (y,r) 



(A5b) 



where A y is the number of data points in bin y. 

For the optimization procedure it is convenient to simplify the expressions for functions m t and 7, (i = 1, 2). Namely, mi 
and TO2 can be written as expressions of 71 and 72. In fact, substituting Eqs. d7ab and dTbl into Eqs. dlOcb and dlOdb . and adding 
and subtracting properly y, yields 

/+00 
Di(x)f a {x\y)dx 
-OO 

(d xo + dna:)/ er (a;|y)da; 



+00 



dxo / f a {x\y)dx + d n 

J — OO 

dio + du(y + 71(2/)) 

f +00 



(x-y)f<r(x\y)dx + d n y f a (x\y)dx 



(A6a) 



m 2 {y) 



[{x -y)D l {x) + D 2 {x)]U{x\y)dx 

-OO 

+00 

[(a; - y){d w + dnx) + d 20 + d 2X x + d 22 x 2 ]f a {x\y)dx 

{x - y) {dxo + dxx{x-y + y)) + d 2Q + 

d 2 i(x - y + y) + d 22 (x - y + y) 2 f a {x\y)dx 



—00 
+00 



2[7i(2/)dio + (72(2/) + yii{y))dn +d 20 + (7i(y) +y)d 21 + (2y7i(y) + 72(2/) +y 2 )d 22 ] 



(A6b) 



Substituting Eqs. (|A6al > and ( |A6bb into Eq. ( fTTT i yields F as a functional depending only on the integrals 71 (y) and 72(2/) 
defined in Eqs. ( II Oat and ( llObl i. apart the six parameters, a and <i,-fc, we want to optimize. 



APPENDIX B: THE PROBABILITY DENSITY FUNCTION U{x\y) 

To solve the minimization problem we will need to explicitly write expressions for f a {x\y). This conditional probability 
density function appears in Eqs. (II Oat and (llObl i and according to the Bayes theorem is given by: 



U{y\x)p{x) 

rr fAy\x')p{x')dx' 



My) = r+oo , , , . ; . ( fi D 



where /<r($/|x) is the probability density function of the measurement noise a( t , i-e. a Gaussian function centered at y with 



variance a 2 , 



f a {y\x) = -^=Q-^, (B2) 

(TV 27T 



and can be written, assuming that the process is stationary, as 



*> " J^) e * w (B3) 



where A/" is some normalized function such that p{x)dx = 1 and 



,W) dx> - (B4) 
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For an Ornstein-Uhlenbeck process Di(x) = dio + dux and D 2 (x) = d2o one finds 

Pou(x) z 



d u 



i d n /- i HO v-> 
Q2 d 20 ^ x -t-d 11 ) 



2d ? nn 



(B5) 



from which one easily sees that du < is a necessary condition to have a well-defined probability density function p(x). 

For the general case given by Eqs. (0 one has typically D 2 (x) > with d 22 > 0, which yields A = 4d 2 od 2 2 — d\ x > 0. In 
these situations, p(x) can also be integrated, yielding 



with 



p G (x) =Af(D 2 (x))^- 1 Q idl °-^ L)ho{x) , 
( 2d 22 x + d 2 i 



h (x) 



arctan 



A 



(B6) 



(B7) 



APPENDIX C: THE DERIVATIVES OF 71, 72, mi AND m 2 

The minimization problem needs also the expression of the derivatives for the 7's and m's. To compute them one needs first 
to write the derivatives of function f a (x\y) defined in Eq. ( IBU . 
Defining g(x, y) = f a {y\x)p(x) one has in general 

dfMv) _ ^f-™9(x>,y)dx>-gf^§^^ 



dX 



where X is some variable on which f a depends. Since p(x) depends only on parameters dij and f cr (y\x) depends only on a, we 
have 

dg(^y) = dJMxl 

OCT OO 

dg(x,y) dp(x) 

= -dd- fAylx) (C2b) 
where for f a (y\x) we have 

^^ = /,(y|*)^# (C3) 

da o i 



and for p(x) we have 



with 



and therefore 



with X one of the d parameters. 
In the Ornstein-Uhlenbeck case 



dpou(x) 
dd w 


= h x 


du 


)pou{x) 


dpou{x) 


= ( 1 




dio )+ 1 \ 


dd n 


\2d 20 


d 2 11 ) ^2d 11 ) 


dpou{x) 


1 


(- 


du ( d w 
d 20 \ dn 


dd 20 


2d 2Q 



D 2 (x) 

M= ( [ p{x)dx] (C5) 



8p(x) , f fdp(x) f +00 dp(x') , 

-dx- =M [-dx-- p{x) I ~dx- dx ' (C6) 



(C7a) 

Pou{x) (C7b) 
I Pou{x) (C7c) 
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and in the general case 

dp G (x) 



dd w 

ddu 
dpc{x) 

dd 20 
dpgjx) 

dd 21 
dpg(x) 

dd 



h (x)pG(x) 
1 



2d 22 
1 

D 2 (x) 
x 



\ogD 2 {x) 



+ 



dh (x) 



dd 20 
dn 



2d 22 
Pg(x) 



ho(x) pg(x) 



22 



D 2 ( X ) 2d 2 2 
/ d 



h (x) + d w 



d 2 \di 
2d 



22 



11 



\2d 2 



22 



'11 



log£>a(x) 



where 



dh Q (x) 
dd 20 

dh (x) 
dd 2 i 

dh (x) 



2dn 



D 2 (x) 2d\ 2 

M 22 (2d 22 x + "21 ) 

A(A + (2d 22 x + d 21 ) 2 ) 

(hi, ( 2 A + d 21 (2d 22 x + d 21 ) 

A °^ x ' A A + (2d 22 x + d 21 ) 2 

2d 20 4 xA - d 20 (2d 22 x + d 21 ) 

-h (x) + 



dh (x) 
dd 21 

d 2 idn 
2d 22 



Pg(x) 



h (x) + d 1Q 



d 9 id 



21"11 



2d 



22 



dhpjx) 
dd 22 



Pg(x) 



■h (x) 



dd 22 A ' A A + (2d 22 x + d 21 ) 2 ' 

So, neglecting the parameter dio as explained in Sec. [Ill] for the other parameters a, dn, d 2 o, d 2 \, d 22 we have 

dJMv) - ^ [(x - y) 2 - i2(y)] U(x\y) 



da 
dh{x\y) 



(*-»)• 

e 



Oil, 



/_r 9(x\ y)dx' 



(C8a) 
(C8b) 
(C8c) 
(C8d) 
(C8e) 

(C9a) 
(C9b) 
(C9c) 

(ClOa) 
(CI Ob) 



and therefore considering Eqs. ( 11 Oat and fllObt that define functions 71 (y) and j 2 (y) and also Eqs. ( |A6at and dA6b| ) defining 



functions toi 

(y 



3a 
972(2/ 



9a 
dmi(y 



da 
dm 2 {y 



da 

dn (y 



ddij 
972(2/ 



dd^ 
dm\(y 



ddn 
dmxiy 



dd 2j 
dm 2 (y 



y) and m 2 (y) it follows 



1 



= -r [hi(y) - 7i(y)72(y)] 



= -[h2(y)-j 2 2 (y)} 



= d 



ii" 



da 



2 ( (d 2i + y(d n + 2d 22 ))^M + ( du + d22 ) 9 ' 



9<T 



da 



+ ~ {x ,_ v fjmdx> 



{x' y) 



y + 71 (y) + d 



dd^ 

'■ d f°( x '\y) dx > 

ddij 

9-yijy) 

ddn 



11 ■ 



in- 



dn(y) 

dd 



ddi 



2 ( (c?2i + y(dn + 2d 22 )) d 'J 1 } y ^ + (dn + d 22 ) dl2 j V ^ + j 2 (y) + 1/71(1/) 



ddi 



dd! 



(CI la) 
(CI lb) 
(Cllc) 
(CI Id) 
(Clle) 
(CI If) 
(Cllg) 
(Cllh) 
(Clli) 
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- 2 (a 2 i + y(d n + 2d 22 )) — h(dii+d 22 )^ hi (Cllj) 



dd 20 V ^20 od 2 o 

dm2{v) 2(( d21+ y( dll+ 2d 22 )) d ^ + ( dll+ d 22 ) d -^l + 



od 2 \ \ od 2 \ ad 2 \ 

dm 2 {y) _ „( fA , „ yj , OJ ^9 7 i(y) , ^ , A ^ 2 {y) , 



2 (d 21 + y(d n + 2d 22 ))^^ + (d n + d 22 )^^ + 2y 7l (y) + j 2 (y) + y 2 (Clll) 
od 22 \ ad 22 od 22 ) 

where 

/+oo 
(x'-^/^x'ly)^' (C12a) 
-OO 
/' + OC 

h 2 {y) = / (x'-y) 4 /,^'^)^'. (C12b) 
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